Database retrieval system and computer-readable storage medium storing a program for database retrieval

ABSTRACT

There is disclosed an information retrieval system which eliminates inconveniences associated with using a plurality of databases at the same time. A target database-extracting device extracts databases containing data as narrowing conditions. An integrated information retrieval device integrates data separately added to identical records in a plurality of databases in response to retrieval conditions input, and retrieves records matching the retrieval conditions based on integrated information of the data. A systematic information retrieval device is responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from the databases, other records systematically close to the designated particular record. A retrieval result display device displays results of the retrieval by the integrated retrieval device and results of the retrieval by the systematic information retrieval device.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to a database retrieval system and a computer-readable storage medium that stores a program for database retrieval, and more particularly to a database retrieval system that retrieves information from a plurality of databases and a computer-readable storage medium that stores a program for retrieving information from the databases.

[0003] 2. Description of the Related Art

[0004] Many researchers in the natural sciences develop their studies with reference to existing data or data accumulated heretofore. Especially in the field of chemistry, reference is often made to physical properties of an immense number of substances to create a compound having new properties. To carry out efficient research in the development of a new material, databases storing information concerning physical properties of substances are frequently used.

[0005] These databases include in-house databases developed by researchers or users of research laboratories, and databases provided by database vendors. These databases differ in contents of data stored therein. For instance, there can be a case where one database contains only values of electric conductivity and refractive indexes of substances. If the user wishes to obtain information concerning transparency and permittivity of the substances, the user has to retrieve information from another database. Therefore, it is necessary for one research organization to use several databases. Further, the environment required for each search and retrieval varies from database to database, and hence the user utilizing a plurality of databases is required to selectively set up a suitable environment database by database.

[0006] To carry out retrieval of information from the different databases, query expressions are prepared for the respective databases and the retrieval of information is carried out using these expressions. Results of the retrievals obtained from all of the required databases are manually arranged in order to obtain a comprehensive listing of results of the research being performed.

[0007] However, it is an extremely time-consuming operation for the user to prepare query expressions and arrange the results of the retrieval in order. Moreover, databases may not store a comprehensive set of the values of physical properties required by the user. This presents the following problems for the user:

[0008] First, when data spread across several databases is narrowed down by a query expression to obtain necessary information, data that would match the query expression if it were stored in a single database can be left out of the results of the query. Assume, for instance, that one database contains data concerning electric conductivity and refractive indexes of substances while another database contains data concerning refractive indexes and thermal conductivity of substances. When the search is carried out by the query expression “electric conductivity” AND (a logical multiplication) “thermal conductivity,” the query will return no data because neither database contains information fulfilling both retrieval conditions. Therefore, it is impossible to obtain information fulfilling both of the two conditions of “electric conductivity” and “thermal conductivity” from the separate databases.

[0009] Second, when the database does not store the specific data needed by the user, there is no available means to predict values for the data. For example, if data for electric conductivity of a substance is missing, it is possible to predict the electric conductivity of the substance using electric conductivity of another substance having similar physical properties. However, information documenting the similarities between the substances that can be used in making this kind of prediction is so diverse that it is difficult for individual users to determine the required similarities.

SUMMARY OF THE INVENTION

[0010] It is a first object of the invention to provide a database retrieval system that solves the problems encountered when a plurality of databases are used in combination.

[0011] It is a second object of the invention to provide a database retrieval system that is capable of providing data for reference such that missing data can be predicted.

[0012] It is a third object of the invention to provide a computer-readable storage medium storing a database retrieval program that is capable of solving the problems encountered when a plurality of databases are used in combination.

[0013] It is a fourth object of the invention to provide a computer-readable storage medium storing a database retrieval program that is capable of providing data for reference such that missing data can be predicted.

[0014] To attain the first object, according to a first aspect of the invention, there is provided a database retrieval system for carrying out information retrieval from a plurality of databases, comprising integrated information retrieval means responsive to retrieval conditions input, for integrating data separately added to identical records in a plurality of databases and retrieving records matching the retrieval conditions based on integrated information of the data.

[0015] To attain the second object, according to a second aspect of the invention, there is provided a database retrieval system for carrying out information retrieval from a database, comprising systematic information retrieval means responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from the database, other records systematically close to the designated particular record.

[0016] To attain the third object, according to a third aspect of the invention, there is provided a computer-readable storage medium storing a program for retrieving information from a plurality of databases, the program controlling a computer to function as integrated information retrieval means responsive to retrieval conditions input, for integrating data separately added to identical records in a plurality of databases and retrieving records matching the retrieval conditions based on integrated information of the data.

[0017] To attain the fourth object, according to a fourth aspect of the invention, there is provided a computer-readable storage medium storing a program for retrieving information from a database, the program controlling a computer to function as systematic information retrieval means responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from the database, other records systematically close to the designated particular record.

[0018] The above and other objects, features and advantages of the present invention will become apparent from the following description when taken in conjunction with the accompanying drawings which illustrate a preferred embodiment of the present invention by way of example.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] FIG. 1 is a diagram showing the principle of a database retrieval system according to the invention;

[0020] FIG. 2 is a diagram showing a classified table of substances;

[0021] FIG. 3 is a diagram showing the whole arrangement of the database retrieval system according to an embodiment of the invention;

[0022] FIG. 4 is a diagram showing a data storage table 111;

[0023] FIG. 5 is a diagram showing a data storage table 121;

[0024] FIG. 6 is a diagram showing an item-table lookup table;

[0025] FIG. 7 is a diagram showing an inter-DB lookup table;

[0026] FIG. 8 is a diagram showing a primary storage table;

[0027] FIG. 9 is a diagram showing a secondary storage table;

[0028] FIG. 10 is a diagram showing a substance classification table;

[0029] FIG. 11 is a diagram showing a retrieval condition-designating screen;

[0030] FIG. 12 is a diagram showing a retrieval result display screen;

[0031] FIG. 13 is a flowchart showing an integrated information retrieval process;

[0032] FIG. 14 is a flowchart showing a systematic information retrieval process;

[0033] FIG. 15 is a diagram which is useful in explaining the advantages of integration of databases;

[0034] FIG. 16 is a diagram showing an example of information retrieval separately carried out on a plurality of databases storing data of physical properties of substances;

[0035] FIG. 17 is a diagram showing an example of information retrieval carried out on the databases storing data of physical properties in an integrating fashion;

[0036] FIG. 18 is a first diagram which is useful in explaining the advantage of the systematic information retrieval;

[0037] FIG. 19 is a second diagram which is useful in explaining the advantage of the systematic information retrieval; and

[0038] FIG. 20 is a third diagram which is useful in explaining the advantage of the systematic information retrieval.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0039] The invention will now be described in detail with reference to drawings showing a preferred embodiment thereof.

[0040] Referring first to FIG. 1, there is illustrated a principle of a database retrieval system according to the invention, which retrieves information from a plurality of databases 11 to 13 storing different contents of data.

[0041] Target database-extracting means 1 is responsive to retrieval conditions input, for extracting databases containing data that are to be narrowed down for information retrieval. For instance, assuming that the databases 11 to 13 store data of values of various physical properties of substances, if a retrieval condition of “3 < electric conductivity < 6” is input, the target database-extracting means 1 extracts only databases storing data of electric conductivity to retrieve information therefrom. This makes it possible to carry out efficient retrieval of information without wasteful processing operations.

[0042] Integrated information retrieval means 2 integrates data separately added to substantially identical records in the databases 11 to 13, and retrieves records matching retrieval conditions from the databases based on the integrated data.

[0043] Systematic information retrieval means 3 retrieves, in response to a systematic information retrieval command in which a specific record is designated, another record systematically close to the designated record.

[0044] Retrieval result display means 4 displays results of an information retrieval carried out by the integrated information retrieval means 2 in response to inputting of retrieval conditions, and results of a retrieval carried out by the systematic information retrieval means 3 in response to a systematic information retrieval command, on a screen of a display device. It should be noted that in this screen there is provided an area for selecting a record to be retrieved, and a systematic information retrieval command designating the record selected from this screen can be input to the systematic information retrieval means 3.

[0045] When the user inputs retrieval conditions to the database retrieval system, the target database-extracting means 1 extracts target databases to be searched for information. Then, the integrated information retrieval means 2 integrates data of the target databases, and retrieves records matching the input retrieval conditions (e.g. substances having predetermined values of physical properties). Results of the retrieval are displayed on the screen of the display device by the retrieval result display means 4.

[0046] This enables the user to retrieve information from all the databases to be searched, by inputting a single set of retrieval conditions. What is more, since the retrieval process is carried out using the integrated data of the databases, it is possible to retrieve data which cannot be retrieved if the databases are searched individually or separately.
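The effect described in paragraphs [0045] and [0046] can be illustrated with a minimal sketch. It assumes two small in-memory databases whose records already share a common substance identifier; the identifiers, property names and values are hypothetical and are not taken from the embodiment.

```python
# Hypothetical per-database records, keyed by a shared substance identifier.
# Property names and values are illustrative only.
db_a = {
    "S1": {"electric_conductivity": 4.0, "refractive_index": 1.5},
    "S2": {"electric_conductivity": 9.0, "refractive_index": 1.3},
}
db_b = {
    "S1": {"refractive_index": 1.5, "thermal_conductivity": 0.2},
    "S2": {"refractive_index": 1.3, "thermal_conductivity": 0.9},
}


def search(records, conditions):
    """Return sorted IDs whose record satisfies every (item, low, high) range."""
    hits = []
    for substance_id, props in records.items():
        if all(item in props and low < props[item] < high
               for item, low, high in conditions):
            hits.append(substance_id)
    return sorted(hits)


def integrate(*databases):
    """Merge the property dictionaries of identical substances into one record."""
    merged = {}
    for database in databases:
        for substance_id, props in database.items():
            merged.setdefault(substance_id, {}).update(props)
    return merged


conditions = [("electric_conductivity", 3, 6), ("thermal_conductivity", 0.1, 0.5)]

# Searching each database on its own finds nothing, because neither holds both items.
separate_hits = search(db_a, conditions) + search(db_b, conditions)
# Searching the integrated records finds the substance that satisfies both ranges.
integrated_hits = search(integrate(db_a, db_b), conditions)
```

Here `separate_hits` is empty while `integrated_hits` contains `"S1"`, which is the situation the integrated retrieval is intended to resolve.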

[0047] If a retrieved record has missing data, and the user requires the missing data, the user selects the record and inputs a command to execute a systematic information retrieval. In response to the input, the systematic information retrieval means 3 retrieves a record systematically close to the designated record. Results of the retrieval are displayed on the screen of the display device by the retrieval result display means 4.

[0048] Thus, information or data of a record systematically close to the designated record is displayed for reference, which enables the user to predict the missing data or value of the designated record from the information of the systematically close record thus displayed.

[0049] It should be noted that systematic closeness between records can be represented by a degree of similarity between information of the records. The method of determining a systematic similarity will be explained by taking polymers as an example.

[0050] To determine a systematic similarity, first, a degree of similarity is calculated by comparing substances with each other. In the case of polymers, substances can be classified according to a plurality of methods. Therefore, if substances are classified into an identical class by the same classifying method, they are determined to have a similarity. More specifically, the degrees of similarity are expressed in numerical values by using a classified table of substances.

[0051] FIG. 2 shows a classified table of substances. This table contains items of classification and classified substances corresponding to the respective classified items.

[0052] In the illustrated example, the substances are classified by four classifying methods: “classification according to generation,” “classification according to structure,” “classification according to monomer composition and a manner of combination of monomers” and “classification according to synthesis method.” Some classifying methods classify the substances in a hierarchical manner. For example, a class “copolymer” in the “classification according to monomer composition and a manner of combination of monomers” is further divided into subclasses of “random copolymer,” “alternating copolymer,” “graft copolymer” and “block copolymer.”

[0053] Now, the degrees of similarity of substances to a substance A will be considered. When attention is paid to “classification according to generation,” the substance A and a substance C belong to the same class, and hence have a similarity. Therefore, the similarity between the substance A and the substance C gains one point. Then, when attention is paid to “classification according to structure,” the substance A and a substance B belong to the same class, i.e. have a similarity. Therefore, the similarity between the substance A and the substance B gains one point. Further, when attention is paid to “classification according to monomer composition and a manner of combination of monomers,” the substance A and the substance B belong to the same class, i.e. have a similarity. Therefore, the similarity between the substance A and the substance B gains another point, for a total of two points. The “classification according to synthesis method” places no substance in the same class as the substance A. Consequently, the similarity between the substance A and the substance B has two points, and the similarity between the substance A and the substance C has one point, so that the substance B is the most similar to the substance A of all the substances.

[0054] Thus, points of similarity between every possible combination or pair of all substances are determined. Now, let it be assumed that the similarity between two substances (substance i and substance j) is equal to $e_{ij}$.
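The point counting described above can be sketched as follows. The class labels are invented for illustration (the contents of FIG. 2 are not reproduced here) and are chosen only so that the substances A, B and C fall into the same classes as in the description; hierarchical subclasses are ignored in this simplified sketch.

```python
from itertools import combinations

# Hypothetical classification table: for each classifying method, the class
# assigned to each substance. The labels are invented for illustration.
classification = {
    "generation": {"A": "g1", "B": "g2", "C": "g1"},
    "structure": {"A": "linear", "B": "linear", "C": "branched"},
    "monomer_composition": {"A": "copolymer", "B": "copolymer", "C": "homopolymer"},
    "synthesis_method": {"A": "m1", "B": "m2", "C": "m3"},
}


def similarity_points(table):
    """Count, for each pair of substances, the methods placing both in one class."""
    substances = sorted(next(iter(table.values())))
    points = {pair: 0 for pair in combinations(substances, 2)}
    for classes in table.values():
        for i, j in points:
            if classes[i] == classes[j]:
                points[(i, j)] += 1
    return points


# e[(i, j)] plays the role of the similarity between substance i and substance j.
e = similarity_points(classification)
```

With these labels the pair (A, B) receives two points and the pair (A, C) one point, in agreement with the worked example.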

[0055] Next, multivariate analysis is applied to the above similarities, and the results of the analysis are projected onto a coordinate system to thereby rank the substances in the order of similarity. More specifically, all the substances are given respective positions in a one-dimensional coordinate system, whereby the similarities between substances are represented by distances between the positions (coordinates) of the substances on the one-dimensional coordinate system. The relationship between the positions of the substance i (coordinate value $\alpha_i$) and the substance j (coordinate value $\alpha_j$) on the one-dimensional coordinate system is expressed by a numerical value obtained by the following expression:

$-(\alpha_i-\alpha_j)^2$  (1)

[0056] The numerical value obtained by the expression (1) is called “similarity in terms of distance.”

[0057] To obtain the coordinate values $\alpha_i$, it is only required to maximize the correlation between the similarities $e_{ij}$ and the values of the expression (1). For simplicity, the following description is made on a case where four substances are ranked in the order of similarity.

[0058] First, from the definition of a coefficient of correlation, the coordinate values $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ are determined such that the inner product Q of the vector of the similarities $e_{ij}$ ($e_{12}$, $e_{13}$, $e_{14}$, $e_{23}$, $e_{24}$, $e_{34}$) and the vector of the similarities in terms of distance becomes the maximum:

$Q=-e_{12}(\alpha_1-\alpha_2)^2-e_{13}(\alpha_1-\alpha_3)^2-e_{14}(\alpha_1-\alpha_4)^2-e_{23}(\alpha_2-\alpha_3)^2-e_{24}(\alpha_2-\alpha_4)^2-e_{34}(\alpha_3-\alpha_4)^2$  (2)

[0059] However, if each $\alpha_i$ is multiplied by k, the value of Q is multiplied by $k^2$, which means that the value of Q can be changed arbitrarily simply by rescaling the coordinates. Therefore, the following normalization condition is imposed:

$\alpha_1^2+\alpha_2^2+\alpha_3^2+\alpha_4^2=1$  (3)

[0060] This converts the above into a constrained extreme value problem, to which the method of Lagrange multipliers can be applied:

$F=-e_{12}(\alpha_1-\alpha_2)^2-e_{13}(\alpha_1-\alpha_3)^2-e_{14}(\alpha_1-\alpha_4)^2-e_{23}(\alpha_2-\alpha_3)^2-e_{24}(\alpha_2-\alpha_4)^2-e_{34}(\alpha_3-\alpha_4)^2-\lambda(\alpha_1^2+\alpha_2^2+\alpha_3^2+\alpha_4^2-1)$  (4)

[0061] In the above equation, F is partially differentiated by $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ and the results are set to 0 (λ represents an eigenvalue), whereby the following equations (5) to (8) can be obtained:

$\begin{aligned}\frac{\partial F}{\partial\alpha_1}&=2\{-e_{12}(\alpha_1-\alpha_2)-e_{13}(\alpha_1-\alpha_3)-e_{14}(\alpha_1-\alpha_4)-\lambda\alpha_1\}=0&&(5)\\\frac{\partial F}{\partial\alpha_2}&=2\{e_{12}(\alpha_1-\alpha_2)-e_{23}(\alpha_2-\alpha_3)-e_{24}(\alpha_2-\alpha_4)-\lambda\alpha_2\}=0&&(6)\\\frac{\partial F}{\partial\alpha_3}&=2\{e_{13}(\alpha_1-\alpha_3)+e_{23}(\alpha_2-\alpha_3)-e_{34}(\alpha_3-\alpha_4)-\lambda\alpha_3\}=0&&(7)\\\frac{\partial F}{\partial\alpha_4}&=2\{e_{14}(\alpha_1-\alpha_4)+e_{24}(\alpha_2-\alpha_4)+e_{34}(\alpha_3-\alpha_4)-\lambda\alpha_4\}=0&&(8)\end{aligned}$

[0062] When these equations are rearranged by using the coordinate values $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$, the following simultaneous equations can be obtained:

$\begin{cases}(-e_{12}-e_{13}-e_{14}-\lambda)\alpha_1+e_{12}\alpha_2+e_{13}\alpha_3+e_{14}\alpha_4=0\\e_{12}\alpha_1+(-e_{12}-e_{23}-e_{24}-\lambda)\alpha_2+e_{23}\alpha_3+e_{24}\alpha_4=0\\e_{13}\alpha_1+e_{23}\alpha_2+(-e_{13}-e_{23}-e_{34}-\lambda)\alpha_3+e_{34}\alpha_4=0\\e_{14}\alpha_1+e_{24}\alpha_2+e_{34}\alpha_3+(-e_{14}-e_{24}-e_{34}-\lambda)\alpha_4=0\end{cases}$  (9)

[0063] The simultaneous equations (9) can be expressed in the following matrix form:

$\begin{bmatrix}\beta_1&e_{12}&e_{13}&e_{14}\\e_{12}&\beta_2&e_{23}&e_{24}\\e_{13}&e_{23}&\beta_3&e_{34}\\e_{14}&e_{24}&e_{34}&\beta_4\end{bmatrix}\begin{bmatrix}\alpha_1\\\alpha_2\\\alpha_3\\\alpha_4\end{bmatrix}=\begin{bmatrix}0\\0\\0\\0\end{bmatrix},\qquad\begin{aligned}\beta_1&=-e_{12}-e_{13}-e_{14}-\lambda\\\beta_2&=-e_{12}-e_{23}-e_{24}-\lambda\\\beta_3&=-e_{13}-e_{23}-e_{34}-\lambda\\\beta_4&=-e_{14}-e_{24}-e_{34}-\lambda\end{aligned}$  (10)

[0064] Therefore, the above extreme value problem reduces to an eigenvalue problem of a symmetric matrix, from which the eigenvalue λ can be determined. In this case, a plurality of values of the eigenvalue λ are obtained.

[0065] Now, from the equations (4) to (8), the following equation is obtained:

$F=\lambda$  (11)
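The intermediate step is not spelled out in the text. One way to arrive at the equation (11), offered here as a sketch rather than as the derivation originally intended, is to multiply each of the equations (5) to (8) by the corresponding $\alpha_i$ and add the results. The terms containing each $e_{ij}$ combine into $-2e_{ij}(\alpha_i-\alpha_j)^2$, so that:

$\sum_{i=1}^{4}\alpha_i\frac{\partial F}{\partial\alpha_i}=2Q-2\lambda\sum_{i=1}^{4}\alpha_i^2=0$

Under the condition (3) the sum of squares equals 1, which gives $Q=\lambda$. Under the same condition the last term of the equation (4) vanishes, so that $F=Q$, and hence $F=Q=\lambda$.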

[0066] From this, it is understood that the maximum value of F, i.e. the maximum value of the inner product Q, is given by the eigenvalue λ. Therefore, the maximum eigenvalue λ is selected, and the coordinate values $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ are determined from the selected maximum eigenvalue λ. These values represent the coordinate values of the respective substances projected onto the coordinate system.

[0067] If all the substances are projected onto a one-dimensional coordinate system, to find a substance which is systematically close to one substance, it is only required to select a substance which is adjacent to the one substance along the coordinate axis.
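The projection and the selection of an adjacent substance can be sketched in plain Python as follows. Two points are assumptions of this sketch and are not stated in the text. First, every row of the matrix in (10) sums to −λ, so the vector with all coordinates equal is always a solution with λ = 0 and would place every substance at the same position; the sketch therefore discards that solution and takes the largest remaining eigenvalue. Second, the eigenvector is found by a shifted power iteration, which is only one of several possible numerical methods. The function names and the similarity points are hypothetical.

```python
import math


def rank_on_axis(names, e, iterations=500):
    """Place substances on one axis from pairwise similarity points e[(i, j)]."""
    n = len(names)
    index = {name: k for k, name in enumerate(names)}
    # Matrix of (10) with lambda moved to the right-hand side:
    # off-diagonal entries e_ij, diagonal entries minus the row sum of e_ij.
    m = [[0.0] * n for _ in range(n)]
    for (a, b), points in e.items():
        i, j = index[a], index[b]
        m[i][j] += points
        m[j][i] += points
        m[i][i] -= points
        m[j][j] -= points
    # Shift the spectrum so that every eigenvalue is positive; power iteration
    # then converges to the largest eigenvalue left after the projection below.
    shift = 2.0 * max(-m[k][k] for k in range(n)) + 1.0
    v = [float(k) for k in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # assumption: remove the all-equal solution
        w = [sum(m[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # enforce condition (3)
    eigenvalue = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
    return dict(zip(names, v)), eigenvalue


def closest(coords, target):
    """Return the substance adjacent to `target` along the coordinate axis."""
    others = [name for name in coords if name != target]
    return min(others, key=lambda name: abs(coords[name] - coords[target]))


# Hypothetical similarity points for four substances; omitted pairs have 0 points.
e = {("A", "B"): 2, ("A", "C"): 1, ("C", "D"): 2}
coords, lam = rank_on_axis(["A", "B", "C", "D"], e)
nearest_to_a = closest(coords, "A")
```

With these points the substances are ordered B, A, C, D along the axis, and the substance adjacent to A is B, the pair with the highest number of similarity points.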

[0068] Now, the invention will be described in further detail based on an embodiment in which it is applied to a client/server system.

[0069] Referring to FIG. 3, there is shown a whole arrangement of the database retrieval system according to the embodiment, which includes a database server 100, a client 200, and a client 200a all connected via a network.

[0070] The database server 100 contains a plurality of databases (DBs) 110, 120, 130 and 140. The databases (DBs) 110, 120, 130 and 140 have respective data storage tables 111, 121, 131 and 141. The databases 110, 120, 130, and 140 are named “DB_A,” “DB_B,” “DB_C,” and “DB_D,” respectively.

[0071] A database integrating block 150 carries out a process for causing the databases to function as if they were a single database (DB). That is, the database integrating block 150 can retrieve information from the databases (DBs) 110, 120, 130 and 140 in an integrated fashion by using a plurality of database management tables. The database management tables include an item-table lookup table 151, an inter-DB substance lookup table 152, a primary storage table 153, a secondary storage table 154, and a substance classification table 155.

[0072] A database management system (DBMS) 101 accesses the databases (DBs) 110, 120, 130, and 140 according to a query in the structured query language (SQL). An SQL-generating block 102 generates an SQL command or query according to a request by the database integrating block 150, etc.

[0073] The client 200 has a user input/output control block 210 that displays a retrieval condition-designating screen 211 or a retrieval result display screen 212 on the display device, and analyzes information provided by an input device based on the contents of any of the screens to generate a request to the database server 100 based on the analyzed information. The retrieval condition-designating screen 211 enables the user to input a query expression or retrieval command. The retrieval result display screen 212 displays results of an integrated retrieval, and is used for inputting a command for a systematic information retrieval when the user intends to carry out the systematic information retrieval. The client 200 is also provided with an SQL-generating block 201 that generates an SQL command or query as required.

[0074] FIG. 4 shows a data storage table 111. The data storage table 111 contains values of physical properties of substances, which are registered therein under respective items indicative of the physical properties (“physical property A,” “physical property B,” . . . ) in a manner correlated to substance IDs in the database DB_A. The substance IDs of the database DB_A are identifiers uniquely assigned to the respective substances, and enable the substances to be discriminated from each other only within the database DB_A 110.

[0075] FIG. 5 shows the data storage table 121. The data storage table 121 contains values of physical properties of substances, which are registered therein under respective items indicative of the physical properties (“physical property a,” “physical property b,” . . . ) in a manner correlated to substance IDs in the database DB_B. The substance IDs of the database DB_B are identifiers uniquely assigned to the respective substances, and enable the substances to be discriminated from each other only within the database DB_B 120. Thus, the data storage tables of these databases are separately organized, and therefore they have different substance IDs assigned to the same substances. Further, the physical properties registered also differ from database to database. This is also the case with the other data storage tables 131 and 141.

[0076] FIG. 6 shows an item-table lookup table 151. The item-table lookup table 151 contains items representative of physical properties of substances and names of data storage tables storing values of the physical properties of the substances. For instance, assuming that “electric conductivity” is entered as the name of an item, the names of the data storage tables of the databases (DBs) storing data of the electric conductivity are registered in correlation with that item.

[0077] By using the item-table lookup table 151, databases containing information on the items named in a query expression can be selected, whereby it is possible to search these databases alone to retrieve information therefrom.
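The selection of target databases through the item-table lookup table can be sketched as below. The table contents are hypothetical; only the database names DB_A to DB_D come from the embodiment.

```python
# Hypothetical contents of the item-table lookup table:
# item name -> names of the data storage tables (here identified by database name).
item_table_lookup = {
    "electric conductivity": ["DB_A"],
    "refractive index": ["DB_A", "DB_B"],
    "thermal conductivity": ["DB_B", "DB_C"],
}


def target_tables(items, lookup):
    """Return the tables holding at least one queried item, in first-seen order."""
    targets = []
    for item in items:
        for table in lookup.get(item, []):
            if table not in targets:
                targets.append(table)
    return targets


tables = target_tables(["electric conductivity", "thermal conductivity"],
                       item_table_lookup)
```

In this sketch DB_D is never selected, because no queried item is registered for it, so it would not be searched at all.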

[0078] FIG. 7 shows an inter-DB substance lookup table 152. In the inter-DB substance lookup table 152, substance IDs assigned to the substances registered in the databases (DBs) are registered and correlated to total identifiers (Total IDs) uniquely assigned to the substances across the databases (DBs) in an integrated fashion. This makes it possible to know what IDs are assigned to a certain substance in different databases (DBs).
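A minimal sketch of such a lookup is given below. The row layout and all identifier values are assumptions made for illustration; the actual layout of FIG. 7 may differ.

```python
# Hypothetical rows of the inter-DB substance lookup table:
# (Total ID, database name, substance ID local to that database).
inter_db_lookup = [
    ("T001", "DB_A", "A-17"),
    ("T001", "DB_B", "B-03"),
    ("T002", "DB_A", "A-21"),
    ("T002", "DB_C", "C-09"),
]

# Index from (database, local substance ID) to Total ID.
to_total = {(db, local_id): total for total, db, local_id in inter_db_lookup}


def total_id(db, local_id):
    """Convert a database-local substance ID to its Total ID (None if unknown)."""
    return to_total.get((db, local_id))


def local_ids(total):
    """Return the local substance ID of a Total ID in every database that has it."""
    return {db: local_id for t, db, local_id in inter_db_lookup if t == total}
```

For example, `total_id("DB_A", "A-17")` and `total_id("DB_B", "B-03")` both resolve to `"T001"`, which identifies the two local records as the same substance.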

[0079] FIG. 8 shows a primary storage table 153. In the primary storage table 153, there are registered total identifiers (Total IDs) of substances which are retrieved by searching the databases in an integrated fashion by using a query expression input by the user.

[0080] FIG. 9 shows a secondary storage table 154. In the secondary storage table 154, there are registered values of physical properties corresponding to the total identifiers (Total IDs) registered in the primary storage table 153.

[0081] FIG. 10 shows a substance classification table 155. In the substance classification table 155, items or classes into which each substance designated by a total identifier (Total ID) is classified are checked.

[0082] FIG. 11 shows a retrieval condition-designating screen 211. In the upper space of the retrieval condition-designating screen 211, there is provided a viewing screen-designating box 211a which is used in designating a screen for viewing results of a retrieval from a plurality of viewing screens available. In most cases, there is provided a menu of viewing screens provided for the respective databases 110, 120, 130 and 140 so as to enable selection of a suitable screen therefrom.

[0083] Below the viewing screen-designating box 211a, there is provided a query expression-entering area 211b. In the query expression-entering area 211b, an upper limit value and a lower limit value can be entered on opposite sides of the name of an item (a name of a physical property). Further, the logical operator “NOT” (for retrieving substances which do not match the conditions defined) can also be designated. When a plurality of conditions are designated, it is also designated whether the conditions should be joined by a logical AND (logical multiplication) operator or by a logical OR (logical sum) operator. If “AND/OR” is designated, a retrieval using a query expression in which the conditions are joined by the logical AND operator and a retrieval using a query expression in which the conditions are joined by the logical OR operator are simultaneously carried out. In the illustrated example, a maximum of five search items or retrieval conditions can be entered.
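The conditions that can be entered in this area can be modelled with a short sketch. It assumes strict inequalities, as in the example “3 < electric conductivity < 6,” and assumes that a record lacking the item never matches; both choices, as well as the function names and values, are illustrative rather than taken from the embodiment.

```python
def matches(record, item, lower=None, upper=None, negate=False):
    """Test one condition: lower < record[item] < upper, optionally negated by NOT."""
    value = record.get(item)
    if value is None:
        return False  # assumption: a missing value never matches
    inside = ((lower is None or lower < value)
              and (upper is None or value < upper))
    return not inside if negate else inside


def evaluate(record, conditions, joiner="AND"):
    """Join the results of all conditions by logical AND or logical OR."""
    results = [matches(record, **condition) for condition in conditions]
    return all(results) if joiner == "AND" else any(results)


record = {"electric conductivity": 4.0, "thermal conductivity": 0.9}
conditions = [
    {"item": "electric conductivity", "lower": 3, "upper": 6},
    {"item": "thermal conductivity", "lower": 0.1, "upper": 0.5},
]

# Designating "AND/OR" corresponds to running both evaluations.
and_hit = evaluate(record, conditions, "AND")
or_hit = evaluate(record, conditions, "OR")
```

Here the record satisfies only the first condition, so it is a hit for the OR retrieval but not for the AND retrieval.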

[0084] At the upper right corner of this screen, there is provided a retrieval execution button 211c. By depressing the retrieval execution button 211c, the retrieval is carried out according to the query expression entered in the query expression-entering area 211b.

[0085] FIG. 12 shows a retrieval result display screen 212. In the retrieval result display screen 212, there is provided a retrieval result display area 212a. The retrieval result display area 212a displays names of substances and values of physical properties of the displayed substances. The physical properties registered for each substance in the searched databases are collectively shown in this area 212a. From this area, it is possible to designate, for each substance, whether or not the systematic information retrieval should be carried out.

[0086] A systematic information retrieval execution button 212b is a button which is depressed when the systematic information retrieval is to be carried out. When this button is depressed, substances which are systematically close to the substances checked for the systematic information retrieval are additionally displayed in the retrieval result display area 212a. This enables the user to predict missing values of physical properties of the retrieved substances by referring to the displayed values of the physical properties of substances which are systematically close to the retrieved substances.

[0087] A screen selector button 212c is depressed for switching the screen over to another screen, e.g. a viewing screen provided for another retrieval function. A message button 212d is depressed for displaying messages from the database server 100.

[0088] The operation of the database retrieval system constructed abovewill be described hereinafter.

[0089] FIG. 13 is a flowchart illustrating the process steps for carrying out the integrated information retrieval process. The integrated information retrieval process is started in the client 200 when a program therefor is started.

[0090] [S1] The user input/output control block 210 of the client 200 prompts the user to input retrieval conditions, and the database server 100 analyzes the retrieval conditions received. More specifically, the user input/output control block 210 displays the retrieval condition-designating screen 211 on the display device, and the user enters a query expression in the query expression-entering area 211 b of this screen. Then, the user depresses the retrieval execution button 211 c. This sends the input query expression as the retrieval conditions to the database server 100. The database integrating block 150 of the database server 100 receives and analyzes the retrieval conditions.

[0091] [S2] The database integrating block 150 determines which databases (hereinafter referred to as “DBs”) contain information on the items (physical properties) included in the retrieval conditions. More specifically, the database integrating block 150 looks up the item-table lookup table 151 to extract names of data storage tables correlated to the items included in the retrieval conditions.

[0092] [S3] The database integrating block 150 determines DBs to be searched. More specifically, the DBs storing the data storage tables extracted at the step S2 are determined to be the DBs to be searched.

[0093] [S4] The database integrating block 150 cooperates with the SQL-generating block 102 to prepare a query in a structured query language (SQL) to be sent to the DBMS 101. The query is formed such that it contains all patterns suitable for the respective DBs to be searched.

[0094] [S5] The DBMS 101 executes the query. In this case, the information is retrieved according to the query from the DBs in their integrated state. The DBMS 101 returns query results.

[0095] [S6] The database integrating block 150 determines whether there is any retrieved information. If there is retrieved information, the program proceeds to step S8, whereas if there is no retrieved information, the program proceeds to step S7.

[0096] [S7] The database integrating block 150 notifies the user that the query result does not contain any hit, followed by returning to the step S1. More specifically, the database integrating block 150 sends a predetermined message to the client 200, and the user input/output control block 210 displays the message on the screen of the display device.

[0097] [S8] The database integrating block 150 retrieves a total identifier (Total ID) of the retrieved substance, and stores the identifier (Total ID) in the primary storage table 153. More specifically, the substance IDs returned from the DBs are converted to total identifiers (Total IDs) by looking up the inter-DB substance lookup table 152. Then, the total identifiers (Total IDs) thus obtained are stored in the primary storage table 153.

[0098] [S9] The database integrating block 150 retrieves values of physical properties of the substances from the DBs, and stores the retrieved values in the secondary storage table 154.

[0099] [S10] The user views the retrieved values of the physical properties of the substances displayed on the retrieval result display screen 212. More specifically, the database integrating block 150 sends contents or values stored in the secondary storage table 154 together with the names of the substances represented by the total identifiers (Total IDs) to the client 200. The user input/output control block 210 opens the retrieval result display screen 212 and displays information sent from the database integrating block 150 on the retrieval result display area 212 a. The user views the contents of the display.

[0100] [S11] The user input/output control block 210 determines whether or not there are any values of physical properties of the substances which cannot be viewed on the present retrieval result display screen 212. That is, each retrieval result display screen 212 is intended only for displaying results of the information retrieval from a corresponding one of the DBs, and the display of physical properties which are not registered in the corresponding DB is left out of consideration. Therefore, results of the information retrieval can contain values of physical properties which cannot be displayed on the present retrieval result display screen 212. If there are any such values, the program proceeds to step S12, whereas if all the values of the physical properties are displayed, the program proceeds to step S13.

[0101] [S12] The user input/output control block 210 displays on the display device a message to the effect that the query result contains values of physical properties which cannot be displayed on the present retrieval result display screen 212. To view the values which cannot be displayed, it is required to designate a suitable viewing screen for a retrieval function adapted to another DB.

[0102] [S13] The user determines whether or not the systematic information retrieval function should be executed. If the systematic information retrieval function is to be executed, the program proceeds to step S14, whereas if the same is not to be executed, the present program is immediately terminated.

[0103] [S14] The database integrating block 150 carries out the systematic information retrieval function. The user input/output control block 210 displays results of the systematic information retrieval on the display device of the client 200, followed by terminating the program.
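
The steps S2 to S5 and S8 can be illustrated by the following minimal sketch, which uses Python's standard sqlite3 module as a stand-in for the DBMS 101. All table names, column names, identifiers and values below are hypothetical and are chosen only for illustration; the sketch is one possible way to form a query over the DBs in their integrated state, not the actual query prepared by the SQL-generating block 102.

```python
import sqlite3

# Hypothetical miniature of the server-side tables; every name and value here
# is an illustrative assumption, not data taken from the figures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_a_props (substance_id TEXT, electric_conductivity REAL);
CREATE TABLE db_b_props (substance_id TEXT, refractive_index REAL);
CREATE TABLE item_table_lookup (item TEXT, db_name TEXT, table_name TEXT);
CREATE TABLE inter_db_lookup (total_id INTEGER, db_name TEXT, substance_id TEXT);
INSERT INTO db_a_props VALUES ('a1', 5.0), ('a2', 4.0);
INSERT INTO db_b_props VALUES ('b7', 6.0), ('b8', 5.0);
INSERT INTO item_table_lookup VALUES
  ('electric_conductivity', 'DB_A', 'db_a_props'),
  ('refractive_index', 'DB_B', 'db_b_props');
INSERT INTO inter_db_lookup VALUES
  (3, 'DB_A', 'a1'), (2, 'DB_A', 'a2'), (3, 'DB_B', 'b7'), (4, 'DB_B', 'b8');
""")


def target_tables(conn, items):
    """S2/S3: find the tables (and hence the DBs) that hold the queried items."""
    marks = ", ".join("?" for _ in items)
    sql = ("SELECT item, db_name, table_name FROM item_table_lookup "
           f"WHERE item IN ({marks})")
    return conn.execute(sql, list(items)).fetchall()


def build_query(targets, conditions, mode):
    """S4: one sub-select per table, keyed by Total ID and joined by UNION ALL.

    Identifiers come from the (trusted) lookup table; bounds are bound as
    parameters. mode is "AND" or "OR".
    """
    parts = [
        f"SELECT l.total_id, '{item}' AS item, t.{item} AS value "
        f"FROM {table} AS t JOIN inter_db_lookup AS l "
        f"ON l.db_name = '{db}' AND l.substance_id = t.substance_id "
        f"WHERE t.{item} IS NOT NULL"
        for item, db, table in targets
    ]
    where = " OR ".join("(item = ? AND value > ? AND value < ?)" for _ in conditions)
    needed = len(conditions) if mode == "AND" else 1
    sql = (f"SELECT total_id FROM ({' UNION ALL '.join(parts)}) "
           f"WHERE {where} GROUP BY total_id "
           f"HAVING COUNT(DISTINCT item) >= {needed} ORDER BY total_id")
    params = [v for cond in conditions for v in cond]
    return sql, params


def integrated_search(conn, conditions, mode):
    """S2 to S5 and S8: return the Total IDs matching the retrieval conditions."""
    targets = target_tables(conn, [c[0] for c in conditions])
    if not targets:
        return []
    sql, params = build_query(targets, conditions, mode)
    return [row[0] for row in conn.execute(sql, params)]


CONDITIONS = [("electric_conductivity", 3, 6), ("refractive_index", 4, 8)]
and_hits = integrated_search(conn, CONDITIONS, "AND")
or_hits = integrated_search(conn, CONDITIONS, "OR")
```

In this sketch, only the tables registered for the queried items are touched, which corresponds to restricting the search to the DBs determined at the step S3.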

[0104] FIG. 14 is a flowchart illustrating the process steps for carrying out the systematic information retrieval process.

[0105] [S21] From the retrieval result display screen 212, the user designates substances for which the systematic information retrieval should be carried out. More specifically, the check box of a desired one of the substances displayed on the retrieval result display screen 212 is checked for the systematic information retrieval. Then, the user depresses the systematic information retrieval execution button 212 b. In response to the depression, the user input/output control block 210 sends a systematic information retrieval command to the database server 100.

[0106] [S22] The database integrating block 150 retrieves a total identifier (Total ID) of the substance designated for the systematic information retrieval by looking up the secondary storage table 154.

[0107] [S23] The database integrating block 150 cooperates with the SQL-generating block 102 to prepare an SQL query.

[0108] [S24] The database integrating block 150 retrieves information from the substance classification table 155 by passing the SQL query to the DBMS 101.

[0109] [S25] The database integrating block 150 calculates a degree of similarity between the designated substance and each of the other substances.

[0110] [S26] The database integrating block 150 obtains simultaneous equations of an eigenvalue λ and a value α representative of the similarity in terms of distance calculated by quantification of the similarity, by applying the method of Lagrange multipliers thereto.

[0111] [S27] The database integrating block 150 expresses the simultaneous equations obtained at the step S26 in matrix form, whereby the eigenvalue λ is determined.

[0112] [S28] The database integrating block 150 determines values α_(i) corresponding to the substances projected onto the coordinate axis, from the maximum value of the eigenvalue λ and the simultaneous equations obtained at the step S26.

[0113] [S29] The database integrating block 150 stores the total identifier (Total ID) of a substance which has a value close to the value of the designated substance (closest or within a predetermined threshold value) among the values α_(i) on the coordinate axis, in the primary storage table 153.

[0114] [S30] The database integrating block 150 retrieves values of physical properties of the substance from the DBs by using the total identifier (Total ID) of the substance stored in the primary storage table 153, and stores the retrieved values in the secondary storage table 154.

[0115] [S31] The database integrating block 150 sends contents of the secondary storage table 154 together with the name of each substance represented by its Total ID to the client 200. The user input/output control block 210 additionally displays information received from the database integrating block 150 on the retrieval result display area 212 a of the retrieval result display screen 212.
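
The steps S25 to S29 can be sketched as follows, under explicitly stated assumptions. The sketch assumes that the degree of similarity e_ij is the number of classification criteria under which two substances fall in the same class, and that the projection maximizes Q = -Σ e_ij (α_i - α_j)² subject to Σ α_i² = 1, which by the method of Lagrange multipliers gives an eigenvalue problem Hα = λα with h_ij = e_ij for i ≠ j and h_ii = -Σ_j e_ij. This is a formulation in the style of a quantification method and may differ from the equations actually used by the database integrating block 150. Because the constant vector is a trivial solution with λ = 0, the sketch takes the largest non-trivial eigenvalue, and it finds the eigenvector by shifted power iteration, which is a numerical technique chosen here for brevity. All names and data are hypothetical.

```python
from itertools import combinations


def similarity_matrix(classifications):
    """S25: for each pair, count the criteria under which both share a class."""
    names = sorted({s for table in classifications.values() for s in table})
    index = {s: k for k, s in enumerate(names)}
    e = [[0.0] * len(names) for _ in names]
    for table in classifications.values():
        for a, b in combinations(table, 2):
            if table[a] == table[b]:
                e[index[a]][index[b]] += 1.0
                e[index[b]][index[a]] += 1.0
    return names, e


def project_to_axis(e, iterations=500):
    """S26 to S28: coordinates alpha_i on a one-dimensional axis."""
    n = len(e)
    if n < 2:
        return [0.0] * n
    # H has the similarities off the diagonal and minus the row sums on it.
    h = [[e[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        h[i][i] = -sum(h[i])
    # Shifting by a constant makes every eigenvalue positive, so that power
    # iteration converges to the largest eigenvalue of H that remains after
    # the constant (trivial) solution is removed by centering.
    shift = 2.0 * max(-h[i][i] for i in range(n)) + 1.0
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]
        y = [shift * x[i] + sum(h[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def closest(names, coords, target):
    """S29: the substance whose coordinate is nearest to that of the target."""
    t = coords[names.index(target)]
    others = (s for s in names if s != target)
    return min(others, key=lambda s: abs(coords[names.index(s)] - t))


# Hypothetical classification data: criterion -> {substance: class}.
CLASSIFICATIONS = {
    "criterion 1": {"A": "p", "B": "p", "C": "q", "D": "q"},
    "criterion 2": {"A": "r", "B": "r", "C": "s", "D": "s"},
    "criterion 3": {"A": "t", "B": "u", "C": "t", "D": "v"},
}
names, e = similarity_matrix(CLASSIFICATIONS)
coords = project_to_axis(e)
```

With these hypothetical classes, the substances A and B share two classes, as do C and D, while A and C share one, so that A and B are placed close together on the axis and away from C and D.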

[0116] Thus, the integrated retrieval and the systematic information retrieval are carried out. This gives the following advantageous effects:

[0117] The integration of DBs is advantageous in that it becomes possible to retrieve information of substances which cannot be retrieved by searches separately carried out on the respective DBs.

[0118] FIG. 15 is a diagram which is useful in explaining the advantageous effects obtained by integrated retrieval of the DBs. In the illustrated example, the data storage table 111 of the database “DB_A,” the data storage table 121 of the database “DB_B,” and the data storage table 131 of the database “DB_C” are integrated into a data storage table 103. In the data storage table 103 obtained by integrating the DBs, values of physical properties of substances registered in the DBs are organized as data in a single or integrated database. However, even if the data storage table 103 is not actually prepared within the database server, the integrated information retrieval from the DBs carried out in a DB-integrating fashion provides the same results as obtained when the information retrieval is carried out using the data storage table 103.

[0119] Let it be assumed that the retrieval is carried out using a query expression of “3<electric conductivity<6 AND/OR 4<refractive index<8.” When this query expression is used in executing the information retrieval from the DBs individually or separately, the database “DB_A” gives results of “AND: 0 hit, OR: {B, C}.” Similarly, the database “DB_B” gives results of “AND: 0 hit, OR: {C, D},” and the database “DB_C” gives results of “AND: 0 hit, OR: {A, B, D}.”

[0120] On the other hand, when the information retrieval using the same query expression is carried out using the data storage table 103 obtained by the integration of the DBs, results of “AND: {C}, OR: {A, B, C, D}” are obtained. That is, a result “AND: {C}” which could not be obtained by the separate or individual retrievals is additionally obtained.
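
The difference between the separate retrievals and the integrated retrieval can be reproduced with a small sketch. The values below are hypothetical and are not taken from FIG. 15; they are chosen only so that the pattern of hits follows the one described above. Substance names stand in for the total identifiers.

```python
# Hypothetical per-DB tables: substance -> {item: value}.
DB_A = {"B": {"electric conductivity": 4.0},
        "C": {"electric conductivity": 5.0},
        "E": {"electric conductivity": 9.0}}
DB_B = {"C": {"refractive index": 6.0},
        "D": {"refractive index": 5.0}}
DB_C = {"A": {"electric conductivity": 5.5},
        "B": {"electric conductivity": 4.0},
        "D": {"refractive index": 5.0}}

# "3<electric conductivity<6 AND/OR 4<refractive index<8"
QUERY = [("electric conductivity", 3, 6), ("refractive index", 4, 8)]


def matches(record, conditions, mode):
    """Evaluate the range conditions on one record; a missing item never matches."""
    hits = [item in record and low < record[item] < high
            for item, low, high in conditions]
    return all(hits) if mode == "AND" else any(hits)


def search(table, conditions, mode):
    """Return the sorted names of the substances in one table that match."""
    return sorted(s for s, rec in table.items() if matches(rec, conditions, mode))


def integrate(*tables):
    """Merge the values registered for identical substances in several tables."""
    merged = {}
    for table in tables:
        for substance, rec in table.items():
            merged.setdefault(substance, {}).update(rec)
    return merged


INTEGRATED = integrate(DB_A, DB_B, DB_C)
separate_and = [search(db, QUERY, "AND") for db in (DB_A, DB_B, DB_C)]
integrated_and = search(INTEGRATED, QUERY, "AND")
integrated_or = search(INTEGRATED, QUERY, "OR")
```

In this hypothetical data, the substance C satisfies the logical AND only after integration, because its electric conductivity and its refractive index are registered in different DBs.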

[0121] Now, the advantageous effects of the integration of the DBs will be described using the names of specific substances.

[0122] FIG. 16 shows an example of results of retrievals separately carried out on the respective DBs. In the illustrated example, the data storage table 111 a of the database “DB_A,” the data storage table 112 a of the database “DB_B,” and the data storage table 113 a of the database “DB_C” contain values of physical properties of various substances, such as “polyethylene terephthalate,” “polyamide 6,” “polybutylene terephthalate,” “polypropylene,” “methacrylate resin,” etc.

[0123] The data storage table 111 a contains values of physical properties of “specific gravity (small),” “tensile rupture (small),” “bending strength (small)” and “bending modulus (small)” registered therein.

[0124] The data storage table 112 a contains values of physical properties of “specific gravity (small),” “tensile rupture (small),” “Izod impact strength (small),” “deflection temperature under load 18 k (large)” and “electric linkage temperature” registered therein.

[0125] The data storage table 113 a contains values of physical properties of “specific gravity (small),” “tensile rupture (small),” “tensile rupture (large),” “breaking extension (small)” and “breaking extension (large)” registered therein.

[0126] Now, let it be assumed that the retrieval is carried out using the retrieval conditions of “1.2<specific gravity (small)<1.7 AND/OR 950<tensile rupture (small)<1300.” If this information retrieval is carried out on the DBs separately, the database “DB_A” gives results of “AND: 0 hit, OR: 1 hit {polyamide 6}.” Similarly, the database “DB_B” gives results of “AND: 0 hit, OR: 1 hit {polybutylene terephthalate},” and the database “DB_C” gives results of “AND: 0 hit, OR: 2 hits {polyamide 6, polybutylene terephthalate}.”

[0127] Such a retrieval cannot retrieve information of substances fulfilling the logical AND of the above retrieval conditions. Therefore, the following retrieval is carried out by integrating the DBs.

[0128] FIG. 17 shows an example of the information retrieval carried out by integrating the databases (DBs) storing data of physical properties of substances. In the data storage table 103 a obtained by integrating the DBs, values of physical properties in the DBs are organized as data in a single database. Now, let it be assumed that the data storage table 103 a is searched using the retrieval conditions as shown in FIG. 16. This search gives results of “AND: 2 hits {polyamide 6, polybutylene terephthalate}, OR: 2 hits {polyamide 6, polybutylene terephthalate}.”

[0129] Thus, by integrating the DBs, information of the two substances fulfilling the logical AND of the above retrieval conditions can be retrieved.

[0130] It should be noted that in the present embodiment, when the integrated retrieval is carried out, only the databases containing items designated in the retrieval conditions are searched. Therefore, the integrated information retrieval process can be efficiently carried out.

[0131] The merit of the systematic information retrieval is that when a substance has a missing value (null value) of a physical property, reference can be made to a value of another substance similar in physical properties.

[0132] FIG. 18 is a first diagram which is useful in explaining the advantage of the systematic information retrieval. In the figure, there are shown sets of information generated at respective process steps by the present system.

[0133] [S101] As shown in the table, there are a lot of missing values in the integrated databases (DBs). However, the user cannot directly recognize the omission of these data. Therefore, the user cannot prepare a query expression which can overcome the inconvenience of omission of the data. So, now, let it be assumed that the user enters a query expression of “4<permittivity<12,” and the information retrieval is carried out in response thereto.

[0134] [S102] Results of the information retrieval are displayed on the screen of the display device of the client 200. In the illustrated example, information of values of physical properties of the substances D, E and Z is obtained. However, even if the user desires to compare these substances fulfilling the above retrieval condition in respect of permittivity, the value of permittivity of the substance D is missing. Therefore, the systematic information retrieval is carried out for the substance D.

[0135] [S103] When a command for the systematic information retrieval is delivered, the database integrating block 150 projects substances in the order of systematic closeness therebetween onto a one-dimensional coordinate system, whereby a substance B is found which is systematically closest to the substance D.

[0136] [S104] Results of the systematic information retrieval are added to the results of the integrated information retrieval, and displayed on the screen of the display device of the client 200. By viewing the displayed results of the information retrieval, it is possible to predict the permittivity of the substance D from the value of permittivity of the substance B. That is, since these two substances are systematically close to each other, it can be assumed that they have similar values of physical properties.

[0137] It should be noted that although in the above embodiment, only the systematically closest substance is extracted or retrieved, in some cases, the substance which is newly extracted by the systematic information retrieval does not have a value of the desired physical property, either. To eliminate this inconvenience, the systematic information retrieval can be carried out by designating the physical properties whose values are required.

[0138] FIG. 19 is a second diagram which is useful in explaining the advantage of the systematic information retrieval. In the figure, there are shown sets of information retrieved at respective process steps when the physical properties whose values are required are designated in advance.

[0139] [S111] In the illustrated example, the DBs integrated are searched by using a query expression of “3<electric conductivity<6 OR 4<refractive index<5 OR 3<permeability<8.”

[0140] [S112] As results of the integrated information retrieval, information of values of physical properties of the substances B and C is retrieved. However, even if the user desires to compare the retrieved substances in respect of the permittivity and refractive index, values of the permittivity and refractive index are missing in the retrieved information concerning the substances B and C. Therefore, the systematic information retrieval is carried out for the substances B and C. In doing this, a condition of “the value of permittivity and that of refractive index are not null” is joined to a query expression or command of the systematic information retrieval by the logical AND.

[0141] [S113] In response to the command of the systematic information retrieval, the database integrating block 150 projects the substances in the order of systematic closeness therebetween onto the one-dimensional coordinate system, whereby among the substances having a value of permittivity and a value of refractive index, substances J and M are found as closest to the substances B and C, respectively.

[0142] [S114] Results of the systematic information retrieval are added to the results of the integrated information retrieval and displayed on the screen of the display device of the client 200. From the displayed results of the retrievals, the values of permittivity and refractive index of the substances B and C can be predicted from those of the substances J and M.
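
The "not null" condition of the steps S112 and S113 can be sketched as a filter applied before the closest substance is chosen. The coordinates and property values below are hypothetical, as are the function and variable names; only the substance letters follow the illustrated example.

```python
# Hypothetical positions on the one-dimensional coordinate system.
COORDS = {"B": 0.10, "D": 0.12, "J": 0.18, "C": 0.50, "K": 0.52, "M": 0.58}

# Hypothetical integrated records; None marks a missing (null) value.
RECORDS = {
    "B": {"permittivity": None, "refractive index": None},
    "C": {"permittivity": None, "refractive index": None},
    "D": {"permittivity": None, "refractive index": 1.5},
    "J": {"permittivity": 3.1, "refractive index": 1.6},
    "K": {"permittivity": 2.4, "refractive index": None},
    "M": {"permittivity": 2.9, "refractive index": 1.4},
}


def closest_with_values(target, coords, records, required):
    """Return the closest substance that has a value for every required item.

    Substances lacking any required value are skipped, however close they are.
    Returns None when no substance qualifies.
    """
    candidates = [
        s for s in coords
        if s != target
        and all(records.get(s, {}).get(p) is not None for p in required)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: abs(coords[s] - coords[target]))


REQUIRED = ["permittivity", "refractive index"]
reference_for_b = closest_with_values("B", COORDS, RECORDS, REQUIRED)
reference_for_c = closest_with_values("C", COORDS, RECORDS, REQUIRED)
```

In this hypothetical data, the substances D and K lie nearest to B and C, respectively, but each lacks one of the required values, so that J and M are selected instead.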

[0143] It should be noted that although in the illustrated example, information of only one substance which is systematically closest to a substance having missing data is retrieved, this is not limitative, but the system may be configured such that the user can set a threshold value as desired to thereby retrieve information of all the substances whose systematic closeness falls within the threshold value. One example will be given in the following:

[0144] FIG. 20 is a third diagram which is useful in explaining the advantage of the systematic information retrieval. In the figure, there are shown sets of information retrieved or generated at respective process steps carried out by the system.

[0145] [S121] In the illustrated example, the DBs integrated are searched by using a query expression of “3<permittivity<6 OR 4<refractive index<5 OR 3<permeability<8.”

[0146] [S122] In the illustrated example, information of values of physical properties of the substances B and C is retrieved. Even if the user desires to know values of permittivity and refractive index of the substance C matching the above query expression, values of the permittivity and refractive index are missing in the retrieved data of the substance C. Therefore, the systematic information retrieval is carried out for the substance C. In doing this, a certain threshold value for determining required systematic closeness is added to a query expression or command thereof.

[0147] [S123] In response to the command of the systematic information retrieval, the database integrating block 150 projects the substances in the order of systematic closeness therebetween onto the one-dimensional coordinate system, whereby substances J and Q are found which are within the threshold value of closeness from the substance C.

[0148] [S124] Results of the systematic information retrieval are added to the results of the integrated information retrieval and displayed on the screen of the display device of the client 200. From the displayed results of the retrievals, the values of permittivity and refractive index of the substance C can be predicted from the values of these physical properties of the substances J and Q. In this case, values of predetermined physical properties of one substance can be predicted using values of the physical properties of a plurality of substances which are systematically close to the one substance, so that the difference between the expected value and the actual value of each required physical property can be minimized.
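
The threshold-based retrieval of the step S123 can be sketched as follows. The coordinates, the threshold value and the property values are hypothetical. The text does not state how the values of a plurality of close substances are combined into a prediction; the arithmetic mean used in `predict` is an assumption made only for illustration.

```python
# Hypothetical positions on the one-dimensional coordinate system.
COORDS = {"B": 0.10, "J": 0.45, "C": 0.50, "Q": 0.56, "M": 0.80}

# Hypothetical property values; None marks a missing (null) value.
RECORDS = {
    "C": {"permittivity": None},
    "J": {"permittivity": 4.0},
    "Q": {"permittivity": 5.0},
    "M": {"permittivity": 9.0},
}


def within_threshold(target, coords, threshold):
    """Return all substances whose coordinate lies within the threshold of the target."""
    return sorted(
        s for s in coords
        if s != target and abs(coords[s] - coords[target]) <= threshold
    )


def predict(item, neighbours, records):
    """Assumed combination rule: mean of the non-null values of the neighbours."""
    values = [
        records[s][item] for s in neighbours
        if records.get(s, {}).get(item) is not None
    ]
    return sum(values) / len(values) if values else None


neighbours_of_c = within_threshold("C", COORDS, threshold=0.1)
predicted_permittivity = predict("permittivity", neighbours_of_c, RECORDS)
```

A wider threshold value admits more reference substances at the cost of including substances that are systematically less close, so the threshold value is left to the user in the configuration described above.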

[0149] The above processing functions can be implemented by a computer. In such a case, details of the process which the database retrieval system is required to execute are written in a program recorded in a computer-readable storage medium. By executing the program, the computer realizes the processes described heretofore. The computer-readable storage medium includes a magnetic recording device, a semiconductor memory, etc. The program may be made available on the market by distributing portable recording media storing the program, such as CD-ROMs (Compact Disk Read Only Memories) and floppy disks, or by storing the program in a storage device of a computer connected to a network, and permitting the program to be transferred to other computers via the network. To execute the program by the computer, the program is stored in a hard disk drive or the like of the computer, and loaded into a main memory for execution.

[0150] As described above, in the database retrieval system according to the first aspect of the invention, data stored in a plurality of databases are integrated and an information retrieval is carried out using the integrated data. Therefore, it is possible to retrieve information from a plurality of databases by a similar procedure to the one taken when the information retrieval is carried out for a single database, which simplifies the retrieval operation. Further, since the information retrieval is carried out on the integrated data or information, it is possible to retrieve records which are left out of the retrieval when the information retrieval is carried out individually or separately.

[0151] Further, in the database retrieval system according to the second aspect of the invention, information is retrieved from other records which are systematically close to a designated record, so that when any of the retrieved records has missing data, the missing data can be predicted from the retrieved information of other records systematically close thereto.

[0152] Further, the computer-readable storage medium according to the third aspect of the invention stores a program that integrates data of a plurality of databases and carries out information retrieval using the integrated data. Therefore, when the program is executed by a computer, the function of integrated information retrieval from a plurality of databases with simple operations can be realized on the computer.

[0153] Further, the computer-readable storage medium according to the fourth aspect of the invention stores a program that carries out information retrieval from other records which are systematically close to a designated record. Therefore, when the program is executed by a computer, if a retrieved record has missing data, data suitable for reference in predicting the missing data can be displayed on the screen of the computer.

[0154] The foregoing is considered as illustrative only of the principles of the present invention. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and applications shown and described, and accordingly, all suitable modifications and equivalents may be regarded as falling within the scope of the invention in the appended claims and their equivalents.

What is claimed is:
1. A database retrieval system for carrying out information retrieval from a plurality of databases, comprising integrated information retrieval means responsive to retrieval conditions input, for integrating data separately added to identical records in a plurality of databases and retrieving records matching said retrieval conditions based on integrated information of said data.
2. A database retrieval system according to claim 1, wherein said integrated information retrieval means integrates values of physical properties separately added to identical substances in said databases when retrieval conditions defining values of physical properties of substances are input, and retrieves substances matching said retrieval conditions based on integrated information of said values of said physical properties.
3. A database retrieval system according to claim 1, further including target database-extracting means for extracting databases containing data as narrowing conditions, said integrated information retrieval means integrating data in said extracted databases and retrieving records from said extracted databases.
4. A database retrieval system according to claim 1, further including systematic information retrieval means responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from said databases, other records systematically close to said designated particular record.
5. A database retrieval system according to claim 4, further including a display device having a screen, and retrieval result display means for displaying data of records retrieved by said integrated information retrieval means on said screen of said display device, as well as receiving an input for selecting a record to be searched by a systematic information retrieval, from said records said data of which are displayed on said screen, and wherein said systematic information retrieval means retrieves another record which is systematically close to said record selected via said retrieval result display means.
6. A database retrieval system according to claim 4, wherein said systematic information retrieval means determines degrees of similarity between individual records, and converts said degrees of similarity to degrees of closeness on a one-dimensional coordinate system to thereby determine that as a record is closer to another record on said one-dimensional coordinate system, said record is systematically closer to said another record.
7. A database retrieval system according to claim 5, wherein when said databases store information of values of physical properties of substances, said systematic information retrieval means classifies said substances according to various criteria, and determines that substances belonging to identical classes more often are more similar to each other.
8. A database retrieval system according to claim 4, wherein said systematic information retrieval means retrieves all other records which are systematically closer to said designated particular record with respect to a threshold value designated in advance.
9. A database retrieval system according to claim 4, wherein said systematic information retrieval means is responsive to said systematic information retrieval command in which an essential data is designated, for retrieving other records which have said essential data and are systematically close to said designated record.
10. A database retrieval system for carrying out information retrieval from a database, comprising: systematic information retrieval means responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from said database, other records systematically close to said designated particular record.
11. A database retrieval system according to claim 10, wherein said systematic information retrieval means determines degrees of similarity between individual records, and converts said degrees of similarity to degrees of closeness on a one-dimensional coordinate system to thereby determine that as a record is closer to another record on said one-dimensional coordinate system, said record is systematically closer to said another record.
12. A database retrieval system according to claim 11, wherein when said database stores information of values of physical properties of substances, said systematic information retrieval means classifies said substances according to various criteria, and determines that substances more often belonging to identical classes are more similar to each other.
13. A database retrieval system according to claim 10, wherein said systematic information retrieval means retrieves all other records which are systematically closer to said designated particular record with respect to a threshold value designated in advance.
14. A database retrieval system according to claim 10, wherein said systematic information retrieval means is responsive to said systematic information retrieval command in which an essential data is designated, for retrieving other records which have said essential data and are systematically close to said designated record.
15. A computer-readable storage medium storing a program for retrieving information from a plurality of databases, said program controlling a computer to function as integrated information retrieval means responsive to retrieval conditions input, for integrating data separately added to identical records in a plurality of databases and retrieving records matching said retrieval conditions based on integrated information of said data.
16. A computer-readable storage medium storing a program for retrieving information from a database, said program controlling a computer to function as systematic information retrieval means responsive to a systematic information retrieval command in which a particular record is designated, for retrieving, from said database, other records systematically close to said designated particular record.